perm filename CACHE.FIG[AM,DBL] blob
sn#435201 filedate 1979-04-24 generic text, type C, neo UTF8
COMMENT ā VALID 00003 PAGES
C REC PAGE DESCRIPTION
C00001 00001
C00002 00002 STORAGE PRINCIPLES
C00010 00003 UPDATING PRINCIPLES
C00013 ENDMK
Cā;
STORAGE PRINCIPLES
------------------
When (and when {\it not} to)
Every time you have to call a lower order function.
You called a function and it took a long time to return.
The value returned has not changed since the last time you called it.
The amount of time to recompute the value would be very high.
E.g., after a lucky accident, where a repeat might
be very costly or even fail entirely to duplicate
your recent success at getting a solution.
If (freq of usage)x(avg cost to recompute)/(freq of change)
is high, then it pays off.
If the value is so critical that the cost-benefit criteria
(eg the resource limits are enormous) always favor
recomputation rather than chancing a blunder,
Then don't cache.
If the function evaluates as fast as the caching mechanism itself,
or if space is too tight, don't cache.
F is called with the same argument x very often.
Why
Temporal cost of recomputing is higher than spatial cost of storage.
Related to ``When" rules above; namely: in those situations, it's
cost-effective to store a cache.
Analogy to utility of hardware caching.
Ability to utilize cached values as usage data to induct upon.
Where
The value to be cached should have a precise storage place.
One extreme is an assertional database, with {\it no} structure.
We prefer to add slot and subslot structuring, with some
domain-specific tinkering of those two sets (especially
the former, the set of slots.)
Thus an extreme example of prime numbers should be stored
on a slot called extreme-examples, on a schema called primes.
At access time, we look in the precise spot(s) where X would
be, and if it isn't there, we compute and cache it right there.
At worst, there should be a precise {\it set} of places where X
might be, a well-defined path to search down for X (e.g.,
a conjecture about prime numbers would either be an Example of
Conjectures, or a Conjecture of Primes; if not in either of those
places.)
How
Before (re)computing F(x), the program presumably searched in the
``right place'' (see ``Where'' above) to find a cache of
F(x) if one existed. At that time, set a pointer to that
place, and when the value is found, place it there.
Called functions might suggest how to cache their value in higher
calling caches
E.g., ``my value changes often so compile my definition
and cache that, but don't waste resources caching this value.''
Cache should be transparent and discardable (should be able to throw
them all away if space is needed).
If there are multiple places to store it (e.g., you learn that
X is a PARENT of Y, but PARENT is a virtual slot defined as
MOTHER $\union$ FATHER; if the cache is to be discardable, then
we must store X under either Y's MOTHER or FATHER slots.)
Then here are some strategies one might use:
Store it redundantly in several (all possible) places
Pick one of the places at random and store it there
Look up information about each place, to decide
Pick one of the places accurately by reasoning, testing, etc.
What
Store the returned value in any case.
Good policy to store a description of the {\it call} which led to
this value being computed. E.g., make sure you
store the amount of resources allotted, the
time of day and the relative place
in the computation (e.g., task number), whether
the user was allowed to be queried, etc.
Along with the call, store data about the actual path to a solution.
This includes algorithms used, time it took to compute,
which other caches were used, etc.
It may be useful to store other intermediate results of the process
that led to the value's being computed successfully
(e.g., a trap that was found and may crop up again.)
This also includes storing some abstract, partially-evaluated
symbolic expressions which generalize the value and its
computation.
If possible, predict (criteria, time-limit, etc.) conditions under
which this will no longer be current. E.g., ``Depends
on the definition of Y being known, and on there being
no examples of Z known, and...'')
Include a discussion of the certainty of the result.
It may be worthwhile to {\it stack} the previous cached values,
enabling the program to determine if (when, how, how often)
they're changing.
If you choose not to cache, you may wish to store a marker recording
--- and hopefully explaining --- that decision.
How to eliminate caches
Just reverse the criteria in ``When" (above), to see if it
is no longer cost-effective to keep the cache.
Remember that F(x) will occupy varying amounts of space,
varying with F, x, and time.
AM, e.g., eliminated largest cached values first.
Eliminate least frequently (or least recently) used caches first.
In general, eliminate the caches that are least cost-effective.
UPDATING PRINCIPLES
-------------------
When
If the current request would have produced a different answer.
Usually, only do this if the current request will produce a
distinctly {\it better} answer. For instance:
If the current request specifies more precision in the answer.
If the resource allocation is high enough to get a better
or more current answer than the one cached already.
Trivially: If there is no cached answer available.
Typically: x or the definition of F changed since F(x) was cached.
Why
Frame problem: if the value is old, it's hard to know for sure that
the world hasn't changed since it was computed.
If new value is computed (either intentionlly or accidentally),
and its freshness (or the other conditions of its call)
makes it ``better" than the existing cached value.
How
Discard the old value, or remember it elsewhere (maybe on disk).
Can use demon traps that flag the cache as out of date, without
bothering to recompute the value if not required to.
If the new value is found, simply store it.
The user may want to know about the updating.
Many other cached values may depend upon this one, and they may have
to be ferreted out and updated also, or at least some
information should be left posted so they can later determine
that they may be out of date.
Usually: discard the whole old value, store just the new one.
If the program is very smart: When the world changes, try to
propogate that change through the network of caches,
changing {\it them}. This is much better than erasing them!